
Analysis of gradient descent methods with non-diminishing, bounded errors


Abstract

The main aim of this paper is to provide an analysis of gradient descent (GD) algorithms with gradient errors that do not necessarily vanish, asymptotically. In particular, sufficient conditions are presented for both stability (almost sure boundedness of the iterates) and convergence of GD with bounded, (possibly) non-diminishing gradient errors. In addition to ensuring stability, such an algorithm is shown to converge to a small neighborhood of the minimum set, which depends on the gradient errors. It is worth noting that the main result of this paper can be used to show that GD with asymptotically vanishing errors indeed converges to the minimum set. The results presented herein are not only more general when compared to previous results, but our analysis of GD with errors is new to the literature to the best of our knowledge. Our work extends the contributions of Mangasarian & Solodov, Bertsekas & Tsitsiklis and Tadic & Doucet. Using our framework, a simple yet effective implementation of GD using simultaneous perturbation stochastic approximation (SPSA), with constant sensitivity parameters, is presented. Another important improvement over many previous results is that there are no `additional' restrictions imposed on the step-sizes. In machine learning applications where step-sizes are related to learning rates, our assumptions, unlike those of other papers, do not affect these learning rates. Finally, we present experimental results to validate our theory.
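To make the setting concrete, below is a minimal sketch of GD driven by SPSA estimates with a constant sensitivity parameter, the regime the abstract describes: keeping the perturbation size fixed leaves a bounded, non-diminishing bias in the gradient estimates. The function names, the step-size schedule a_n = 1/(n+1), and the quadratic test problem are illustrative assumptions, not the paper's actual experiments.

```python
import numpy as np

def spsa_gradient(f, x, c, rng):
    """Two-sided SPSA gradient estimate with a *constant* sensitivity
    parameter c. With c fixed, the estimate carries a bounded bias that
    does not vanish asymptotically -- the error regime analyzed here."""
    delta = rng.choice([-1.0, 1.0], size=x.shape)  # Rademacher perturbation
    return (f(x + c * delta) - f(x - c * delta)) / (2.0 * c * delta)

def gd_with_spsa(f, x0, steps=5000, c=0.1, seed=0):
    """Gradient descent using the biased SPSA estimates. The step-sizes
    a_n = 1/(n+1) satisfy the standard conditions (sum a_n = inf,
    sum a_n^2 < inf); no additional restrictions are imposed on them."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    for n in range(steps):
        a_n = 1.0 / (n + 1)
        x = x - a_n * spsa_gradient(f, x, c, rng)
    return x

if __name__ == "__main__":
    # Quadratic test problem whose minimum set is the point (1, -2).
    f = lambda x: (x[0] - 1.0) ** 2 + (x[1] + 2.0) ** 2
    x_final = gd_with_spsa(f, x0=[5.0, 5.0])
    # The iterates remain bounded and settle in a small neighborhood of
    # the minimum set, whose size depends on the bounded gradient error.
    print(x_final)
```

In line with the stated result, the iterates do not converge to the minimizer exactly; shrinking c (i.e., letting the gradient error vanish asymptotically) recovers convergence to the minimum set itself.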
